[fix](gpt-oss): fix quark quantized model in moe bias#787

Merged
valarLip merged 1 commit into main from quant_gpt_oss
May 18, 2026

@PerryZhang01 PerryZhang01 commented May 14, 2026

Contributor

Motivation

This PR fixes a padding error in the quantized gpt_oss model. The quantized gpt-oss-120b checkpoint comes from the Quark team (https://huggingface.co/amd/gpt-oss-120b-moe-ori-attn-ptpc); it quantizes only the GEMM weights in attention, using the PTPC method. The MoE biases are padded, and allocating them as empty (uninitialized) tensors introduces dirty data into the padded region, so this PR uses zero-initialized bias tensors instead.
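
A minimal sketch of the idea, not the code changed in this PR: it uses NumPy as a stand-in for the framework's tensor allocation (`np.zeros` playing the role of a zero-initialized tensor, and a NaN-filled buffer emulating the unspecified contents of an empty one). The function names `pad_bias_zero` and `pad_bias_dirty`, the shapes, and the padded size are all hypothetical and chosen only for illustration.

```python
import numpy as np


def pad_bias_zero(bias: np.ndarray, padded_size: int) -> np.ndarray:
    """Copy `bias` into a zero-initialized buffer whose last dim is `padded_size`."""
    if padded_size < bias.shape[-1]:
        raise ValueError("padded_size must be >= the bias length")
    padded = np.zeros(bias.shape[:-1] + (padded_size,), dtype=bias.dtype)
    padded[..., : bias.shape[-1]] = bias  # real values; the tail stays 0
    return padded


def pad_bias_dirty(bias: np.ndarray, padded_size: int) -> np.ndarray:
    """Emulate padding into an uninitialized buffer.

    A truly uninitialized buffer (e.g. np.empty) holds unspecified values.
    NaN is used here as a deterministic stand-in for that dirty data.
    """
    padded = np.full(bias.shape[:-1] + (padded_size,), np.nan, dtype=bias.dtype)
    padded[..., : bias.shape[-1]] = bias  # only the real region is overwritten
    return padded


if __name__ == "__main__":
    # Hypothetical per-expert bias: 2 experts, 3 output channels, padded to 4.
    bias = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float32)
    print(pad_bias_zero(bias, 4))   # padded column is 0
    print(pad_bias_dirty(bias, 4))  # padded column carries the dirty value
```

Any kernel that reads the full padded width adds the tail of the bias to its output, so whatever sits in that tail ends up in the result. Zero-initializing the buffer makes the padded region contribute nothing.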

@valarLip valarLip merged commit aa7c25a into main May 18, 2026
67 of 85 checks passed
@valarLip valarLip deleted the quant_gpt_oss branch May 18, 2026 09:05
sijyang pushed a commit that referenced this pull request May 24, 2026
Co-authored-by: perzhang <perzhang@amd.com>

4 participants